120 research outputs found
MDL Denoising Revisited
We refine and extend an earlier MDL denoising criterion for wavelet-based
denoising. We start by showing that the denoising problem can be reformulated
as a clustering problem, where the goal is to obtain separate clusters for
informative and non-informative wavelet coefficients. This
suggests two refinements: adding a code length for the model index, and
extending the model to account for subband-dependent coefficient
distributions. A third refinement is the derivation of a soft thresholding rule
inspired by predictive universal coding with weighted mixtures. We propose a
practical method incorporating all three refinements, which is shown to achieve
good performance and robustness in denoising both artificial and natural signals.
Comment: Submitted to IEEE Transactions on Information Theory, June 200
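As a generic illustration of the soft thresholding operation mentioned in the abstract (this is the standard shrinkage rule, not the paper's MDL-derived variant; the function name and threshold value are mine):

```python
def soft_threshold(x, t):
    """Soft thresholding: shrink coefficient x toward zero by t,
    zeroing anything with magnitude at most t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

# Small (presumably noise) coefficients are zeroed; large
# (presumably informative) ones are shrunk toward zero.
coeffs = [3.0, -0.2, 1.5, -2.4, 0.05]
denoised = [soft_threshold(c, 0.5) for c in coeffs]
```

In an MDL setting the threshold is chosen by minimizing a code length rather than being fixed by hand, which is what the criterion described above provides.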
Terveisiä huippuyliopistoista (Greetings from Top Universities)
The author describes, from his own perspective, the differences between top universities in Finland and the United States, after spending a few months visiting both the University of California, Berkeley and MIT. After volcanic eruptions and other obstacles, he gets to meet Turing Award winner Professor Barbara Liskov to ask her the secret of success.
Non peer reviewed
Yksinkertainen on kaunista (Simplicity Is Beautiful: Occam's Razor in Statistical Modelling)
Simplicity is a powerful principle of inductive inference. It appears in many everyday situations as an informal rule of thumb: the simplest explanation is the best. The principle of simplicity, Occam's razor, can also serve as a basis for statistical inference. Its formal version, the so-called Minimum Description Length principle (the MDL principle), ranks alternative hypotheses by which of them permits the shortest description of the data, where the description also includes the hypothesis itself. Methods from information theory and data compression are used to determine the description length. In this article I present some concepts of information theory. The latter half of the article covers the basics of the MDL principle.
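The idea of describing the data together with the hypothesis itself can be written as the standard crude two-part form of the MDL principle (the notation here is mine):

```latex
\hat H \;=\; \arg\min_{H \in \mathcal{H}} \bigl[\, L(H) + L(D \mid H) \,\bigr],
\qquad
L(D \mid H) \;=\; -\log_2 P(D \mid H),
```

where $L(H)$ is the number of bits needed to describe the hypothesis $H$ and $L(D \mid H)$ is the number of bits needed to describe the data $D$ once $H$ is fixed, by Shannon's correspondence between probabilities and code lengths.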
Minimum description length revisited
This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. While MDL was originally based on data compression ideas, this introduction can be read without any knowledge thereof. It takes into account all major developments since 2007, the last time an extensive overview was written. These include new methods for model selection, model averaging, and hypothesis testing, as well as the first completely general definition of MDL estimators. Incorporating these developments, MDL can be seen as a powerful extension of both penalized likelihood and Bayesian approaches, in which penalization functions and prior distributions are replaced by more general luckiness functions, average-case methodology is replaced by a more robust worst-case approach, and in which methods classically viewed as highly distinct, such as AIC versus BIC and cross-validation versus Bayes, can, to a large extent, be viewed from a unified perspective.
Peer reviewed
Learning Locally Minimax Optimal Bayesian Networks
We consider the problem of learning Bayesian network models in a non-informative setting, where the only available information is a set of observational data, and no background knowledge is available. The problem can be divided into two different subtasks: learning the structure of the network (a set of independence relations), and learning the parameters of the model (that fix the probability distribution from the set of all distributions consistent with the chosen structure). There are not many theoretical frameworks that consistently handle both these problems together, the Bayesian framework being an exception. In this paper we propose an alternative, information-theoretic framework which sidesteps some of the technical problems facing the Bayesian approach. The framework is based on the minimax-optimal Normalized Maximum Likelihood (NML) distribution, which is motivated by the Minimum Description Length (MDL) principle. The resulting model selection criterion is consistent, and it provides a way to construct highly predictive Bayesian network models. Our empirical tests show that the proposed method compares favorably with alternative approaches in both model selection and prediction tasks.
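As a small, self-contained illustration of the NML distribution underlying the criterion above (this is the textbook NML code length for a one-parameter Bernoulli model computed by brute-force normalization, not the paper's Bayesian-network algorithm; all names are mine):

```python
from math import comb, log2

def nml_code_length(k, n):
    """NML code length in bits of a binary sequence of length n containing
    k ones, under the Bernoulli model. The normalizer sums the maximized
    likelihood over all possible data sets of length n."""
    def max_likelihood(k, n):
        # likelihood of the data at the ML estimate theta_hat = k/n
        # (0**0 evaluates to 1 in Python, which is the convention needed here)
        p = k / n
        return (p ** k) * ((1 - p) ** (n - k))

    normalizer = sum(comb(n, j) * max_likelihood(j, n) for j in range(n + 1))
    return -log2(max_likelihood(k, n) / normalizer)
```

Data that fit some parameter value well (e.g. all zeros) get a short NML code, while data near theta = 1/2 get a longer one; model selection then compares such code lengths across candidate structures.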